Learning from Concept Drifting Data Streams with Unlabeled Data

Authors

  • Pei-Pei Li
  • Xindong Wu
  • Xuegang Hu
Abstract

Contrary to the common assumption that all arriving streaming data are labeled and that class labels are immediately available, we propose a Semi-supervised classification algorithm for data streams with concept drifts and UNlabeled data, called SUN. SUN is based on an evolved decision tree. Concept drifts are distinguished from noise at the leaves according to the deviation between historical concept clusters and new ones generated by a clustering algorithm based on k-Modes. Extensive studies on both synthetic and real data demonstrate that SUN performs well compared to several known online algorithms on unlabeled data. We therefore conclude that SUN provides a feasible reference framework for tackling concept-drifting data streams with unlabeled data.
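The abstract does not give SUN's clustering or deviation formulas, so the following is only a minimal sketch of the general idea: cluster categorical records with plain k-modes, then measure how far the new cluster modes sit from the historical ones. The function names, the first-k seeding, and the normalized-Hamming deviation measure are all assumptions made for illustration and are not taken from the paper.

```python
from collections import Counter

def hamming(a, b):
    """Number of attribute positions where two categorical records differ."""
    return sum(x != y for x, y in zip(a, b))

def k_modes(records, k, iterations=10):
    """Plain k-modes: assign each record to its nearest mode, then recompute modes."""
    # Assumption: the first k records seed the modes (not the paper's initialization).
    modes = [tuple(r) for r in records[:k]]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for r in records:
            nearest = min(range(k), key=lambda i: hamming(r, modes[i]))
            clusters[nearest].append(r)
        new_modes = []
        for i, members in enumerate(clusters):
            if not members:
                new_modes.append(modes[i])  # keep the old mode for an empty cluster
                continue
            # The mode of a cluster is the most frequent value in each attribute column.
            new_modes.append(tuple(Counter(col).most_common(1)[0][0]
                                   for col in zip(*members)))
        if new_modes == modes:
            break
        modes = new_modes
    return modes

def deviation(old_modes, new_modes):
    """Hypothetical measure: mean normalized Hamming distance from each
    new mode to its closest historical mode (0.0 means no change)."""
    width = len(new_modes[0])
    return sum(min(hamming(n, o) for o in old_modes) / width
               for n in new_modes) / len(new_modes)

# Toy batches of categorical records (illustrative data only).
history = [("a", "x", "p"), ("b", "y", "q"), ("a", "x", "p"), ("b", "y", "q")]
arrived = [("c", "z", "p"), ("b", "y", "q"), ("c", "z", "p"), ("b", "y", "q")]

old = k_modes(history, 2)
new = k_modes(arrived, 2)
print(deviation(old, new))
```

In a scheme like this, a deviation that stays small could be treated as noise and a large one as a candidate drift, but where that boundary lies is a tuning decision that the abstract does not specify.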

Similar resources

An adaptive ensemble classifier for mining concept drifting data streams

Traditional data mining techniques cannot be directly applied to the real-time data streaming environment. Existing mining classifiers therefore need to be updated frequently to adapt to the changes in data streams. In this paper, we address this issue and propose an adaptive ensemble approach for classification and novel class detection in concept-drifting data streams. The proposed approach uses...

Detecting Concept Drift in Data Stream Using Semi-Supervised Classification

A data stream is a sequence of data generated from various information sources at high speed and high volume. Classifying data streams faces three challenges: unlimited length, online processing, and concept drift. In related research, the challenge of unlimited stream length is commonly met by dividing the stream into fixed-size windows or by using gradual forgetting. Concept drift refer...

Integrating Novel Class Detection with Classification for Concept-Drifting Data Streams

In a typical data stream classification task, it is assumed that the total number of classes is fixed. This assumption may not be valid in a real streaming environment, where new classes may evolve at any time. Traditional data stream classification techniques are not capable of recognizing novel class instances until the appearance of the novel class is manually identified, and labeled instan...

Boosting classifiers for drifting concepts

This paper proposes a boosting-like method to train a classifier ensemble from data streams. It naturally adapts to concept drift and allows the drift to be quantified in terms of its base learners. The algorithm is empirically shown to outperform learning algorithms that ignore concept drift. It performs no worse than advanced adaptive time window and example selection strategies that store all the...

An Ensemble Classifier for Drifting Concepts

This paper proposes a boosting-like method to train a classifier ensemble from data streams. It naturally adapts to concept drift and allows the drift to be quantified in terms of its base learners. The algorithm is empirically shown to outperform learning algorithms that ignore concept drift. It performs no worse than advanced adaptive time window and example selection strategies that store all the...

Journal:
  • Neurocomputing

Volume 92  Issue

Pages  -

Publication date 2010